AIbase
Home
AI Tools
AI Models
MCP
AI NEWS
EN
Model Selection
Tags
Visual Question Answering Expert

# Visual Question Answering Expert

Llama 3.2 90B Vision Instruct
Llama 3.2-Vision is a multimodal large language model developed by Meta, supporting image and text input with text output, excelling in visual recognition, image reasoning, image captioning, and visual question answering tasks.
Image-to-Text Transformers Supports Multiple Languages
L
meta-llama
15.44k
337
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
English简体中文繁體中文にほんご
© 2025AIbase